Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification
This paper proposes a novel method for the privacy-preserving release of an entire dataset while maintaining useful properties, such as the statistics required for reconstructing a support vector machine classifier. Privacy and utility guarantees are balanced through an explicit optimization problem. The dataset is systematically obfuscated using additive noise, and the inverse of the trace of the Fisher information matrix is used as a measure of privacy for the entries of the dataset. By the Cramér-Rao bound [3, p. 169], the use of the Fisher information matrix makes the privacy metric independent of the sophistication of the adversary, thus making it a universal measure of privacy. Further, the Cramér-Rao bound gives data owners a practical, operational interpretation of the privacy measure, i.e., how much someone can learn about an individual in the dataset based on the publicly released obfuscated data.
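To make the privacy measure concrete, the following is a minimal sketch rather than the paper's actual mechanism. It assumes zero-mean Gaussian additive noise with a known covariance, for which the Fisher information about the private entry is the inverse of the noise covariance. The names `fisher_information_gaussian` and `privacy_measure`, the example entry `x`, and the noise level `sigma` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fisher_information_gaussian(cov):
    """Fisher information about x in y = x + w with w ~ N(0, cov).

    Assumption: the noise is Gaussian with known covariance, so the
    Fisher information matrix is simply the inverse covariance.
    """
    return np.linalg.inv(cov)


def privacy_measure(cov):
    """Inverse of the trace of the Fisher information matrix."""
    return 1.0 / np.trace(fisher_information_gaussian(cov))


rng = np.random.default_rng(0)

x = np.array([1.0, -2.0, 0.5])   # hypothetical private entry
sigma = 2.0                      # hypothetical noise standard deviation
cov = sigma**2 * np.eye(x.size)  # isotropic noise covariance

# Obfuscated entry that would be released publicly.
y = x + rng.multivariate_normal(np.zeros(x.size), cov)

# Privacy measure: for isotropic noise this equals sigma^2 / n.
privacy = privacy_measure(cov)

# Cramer-Rao bound: any unbiased estimator of x built from y has total
# mean squared error at least trace(I^{-1}), here n * sigma^2.
crb = np.trace(np.linalg.inv(fisher_information_gaussian(cov)))

print("privacy measure:", privacy)
print("Cramer-Rao lower bound on total MSE:", crb)
```

Under these assumptions, increasing `sigma` raises both the privacy measure and the lower bound on an adversary's estimation error, regardless of which unbiased estimator the adversary uses. This is the operational reading of the measure described above.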